

Search for: All records

Creators/Authors contains: "Botelho, Anthony F"


  1. Prior work analyzing tutoring sessions has provided evidence that highly effective tutors, drawing on their interactions with students and their experience, can perceptively recognize the incorrect processes, or “bugs,” behind students' wrong answers. Researchers studying these tutoring interactions have examined instructional approaches for addressing such processes and observed that the format of the feedback can influence learning outcomes. In this work, we refer to the incorrect answers produced by these buggy processes as Common Wrong Answers (CWAs). Because teachers and instructional designers deeply understand the common approaches and mistakes students make when solving mathematical problems, we examine the feasibility of having them identify CWAs proactively and author Common Wrong Answer Feedback (CWAFs) as a formative feedback intervention that addresses student learning needs. We analyze the CWAFs in three sets of analyses: we first report the accuracy of the CWAs predicted by teachers and instructional designers for the problems across two activities, we then measure the effectiveness of the CWAFs using an intent-to-treat analysis, and we finally explore whether the CWAFs show personalization effects for the students working on the two mathematics activities.
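A minimal sketch of how CWAF delivery might look inside a learning platform, assuming a simple answer-matching rule; the problem IDs, wrong answers, and feedback strings below are invented for illustration and are not taken from the study.

```python
# Hypothetical CWAF bank authored in advance by teachers/instructional designers.
CWAF_BANK = {
    "prob_101": {
        "3/8": "Check the denominators: they must match before you add the fractions.",
        "0.38": "Convert both fractions to a common denominator before adding.",
    },
}

def feedback_for(problem_id, student_answer):
    """Return formative feedback if the answer matches an anticipated CWA, else None."""
    return CWAF_BANK.get(problem_id, {}).get(student_answer.strip())

print(feedback_for("prob_101", "3/8"))  # matched CWA -> targeted feedback
print(feedback_for("prob_101", "5/8"))  # not a CWA -> None (fall back to default feedback)
```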
  2. Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
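A minimal sketch of the general idea, not the authors' exact estimator: a model trained on observational ("remnant") data predicts RCT outcomes from covariates only, and those predictions are subtracted before taking a difference in means. Because the predictions never see the treatment assignment, the adjustment cannot bias the randomization-based estimate. All data below are simulated.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Observational data from non-participants: covariates X_obs, outcomes y_obs.
X_obs = rng.normal(size=(5000, 4))
y_obs = X_obs @ np.array([1.0, -0.5, 0.3, 0.0]) + rng.normal(size=5000)

# RCT data: covariates, random assignment z, outcomes with a true effect of 0.3.
X_rct = rng.normal(size=(400, 4))
z = rng.integers(0, 2, size=400)
y_rct = X_rct @ np.array([1.0, -0.5, 0.3, 0.0]) + 0.3 * z + rng.normal(size=400)

# Step 1: learn to predict outcomes from covariates using only observational data.
model = GradientBoostingRegressor().fit(X_obs, y_obs)
y_hat = model.predict(X_rct)            # uses covariates only, never z

# Step 2: design-based estimate on residuals (treatment minus control means).
resid = y_rct - y_hat
tau_adj = resid[z == 1].mean() - resid[z == 0].mean()
tau_raw = y_rct[z == 1].mean() - y_rct[z == 0].mean()
print(f"unadjusted: {tau_raw:.3f}  adjusted: {tau_adj:.3f}")
```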

     
  3. The development and application of deep learning methodologies have grown within educational contexts in recent years. Owing in part to the large amount of data made available through the adoption of computer-based learning systems in classrooms and larger-scale MOOC platforms, many educational researchers are leveraging a wide range of emerging deep learning approaches to study learning and student behavior in various capacities. Variations of recurrent neural networks, for example, have been used not only to predict learning outcomes but also to study sequential and temporal trends in student data; it is commonly believed that they are able to learn high-dimensional representations of learning and behavioral constructs over time, such as the evolution of a student's knowledge state while working through assigned content. Recent works, however, have started to dispute this belief, instead finding that it may be model complexity that leads to improved performance on many prediction tasks, and that these methods may not inherently learn such temporal representations through model training. In this work, we explore these claims further in the context of detectors of student affect, expanding on existing work that explored benchmarks in knowledge tracing. Specifically, we observe how fully trained models perform compared to deep learning networks in which training is applied only to the output layer. While the best results of prior works using fully trained recurrent models remain superior, our untrained versions perform comparably well, even outperforming previous non-deep learning approaches.
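A minimal sketch of the comparison described above, with PyTorch as an assumed framework and synthetic data: the recurrent layer is left at its random initialization and frozen, so gradient updates reach only the output layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class FrozenLSTMClassifier(nn.Module):
    def __init__(self, n_features=10, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        for p in self.lstm.parameters():       # freeze the recurrent layer
            p.requires_grad = False
        self.out = nn.Linear(hidden, 1)        # only this layer is trained

    def forward(self, x):
        h, _ = self.lstm(x)                    # (batch, time, hidden)
        return self.out(h[:, -1])              # predict from the last time step

# Synthetic "student action sequences": 256 sequences of 20 steps, 10 features.
X = torch.randn(256, 20, 10)
y = (X.mean(dim=(1, 2)) > 0).float().unsqueeze(1)

model = FrozenLSTMClassifier()
opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```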
  4. As computer-based learning platforms have become ubiquitous, there is a growing need to better support teachers. Particularly in mathematics, teachers often rely on open-ended questions to assess students' understanding. While prior works focusing on the development of automated assessments of open-ended work have demonstrated their potential, many of those methods require large amounts of student data to make reliable estimates. We explore whether a problem-specific automated scoring model could benefit from auxiliary data collected from similar problems to address this “cold start” problem. We examine factors such as sample size and the degree of similarity of the auxiliary problem data. We find that using data from similar problems not only improves predictive performance by increasing sample size, but also leads to greater overall model performance than using data solely from the original problem when sample size is held constant.
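A minimal sketch of the cold-start idea, assuming a TF-IDF plus logistic regression scoring pipeline (an illustrative choice, not the paper's model): graded responses from a similar problem are pooled with the target problem's small training set before fitting. The responses and scores are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny training set for the target problem (the "cold start" situation).
target_responses = ["the slope is rise over run", "slope is 2 because y doubles"]
target_scores    = [1, 1]

# Auxiliary graded responses from a similar problem.
similar_responses = ["slope means change in y over change in x", "i do not know",
                     "the line goes up by 2 each time", "it is just a line"]
similar_scores    = [1, 0, 1, 0]

# Pool the target problem's data with the similar problem's data.
X_train = target_responses + similar_responses
y_train = target_scores + similar_scores

scorer = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scorer.fit(X_train, y_train)
print(scorer.predict(["the slope tells how fast y changes with x"]))
```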
  5. Prior works have led to the development and application of automated assessment methods that leverage machine learning and natural language processing. The performance of these methods has often been reported as positive, but other prior works have identified aspects in which they may be improved. Particularly in the context of mathematics, the presence of non-linguistic characters and expressions has been identified as a contributor to observed model error. In this paper, we build upon this prior work by examining a previously developed automated assessment model for open-response questions in mathematics. We develop a new approach, which we call the “Math Term Frequency” (MTF) model, to address the error introduced by non-linguistic terms, and we ensemble it with the previously developed assessment model. We observe that the inclusion of this approach notably improves model performance, and we present a practical example of how error analyses can be leveraged to address model limitations.
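A loose sketch of the ensembling idea, with an assumed regular expression standing in for the extraction of non-linguistic "math terms"; the actual MTF model and the base assessment model in the paper may be constructed quite differently. All responses and scores are invented.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical rule: numbers, operators, and simple expressions count as math terms.
MATH_TOKEN = re.compile(r"[0-9]+(?:\.[0-9]+)?|[+\-*/=^()]|[0-9]*x")

def math_terms(text):
    """Keep only the non-linguistic (number/operator/expression) tokens of a response."""
    return " ".join(MATH_TOKEN.findall(text))

responses = ["3/4 + 1/4 = 1 so the answer is 1",
             "you add the tops when the bottoms match",
             "2 + 2 = 5",
             "idk"]
scores = [1, 1, 0, 0]

# One model on the full text, one "MTF-style" model on math terms only.
text_model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
mtf_model  = make_pipeline(CountVectorizer(tokenizer=str.split, token_pattern=None),
                           LogisticRegression(max_iter=1000))
text_model.fit(responses, scores)
mtf_model.fit([math_terms(r) for r in responses], scores)

# Ensemble by averaging predicted class probabilities.
new = ["1/2 + 1/2 = 1"]
proba = (text_model.predict_proba(new) + mtf_model.predict_proba([math_terms(new[0])])) / 2
print(proba)
```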
  7. Open-ended questions in mathematics are commonly used by teachers to monitor and assess students' deeper conceptual understanding of content. Student answers to these types of questions often exhibit a combination of language, drawn diagrams and tables, and mathematical formulas and expressions that supply teachers with insight into the processes and strategies adopted by students in formulating their responses. While these student responses help to inform teachers of their students' progress and understanding, the amount of variation in these responses can make it difficult and time-consuming for teachers to manually read, assess, and provide feedback on student work. For this reason, there has been a growing body of research into developing AI-powered tools to support teachers in this task. This work seeks to build upon this prior research by introducing a model designed to help automate the assessment of student responses to open-ended questions in mathematics through sentence-level semantic representations. We find that this model outperforms previously published benchmarks across three different metrics. With this model, we conduct an error analysis to examine characteristics of student responses that may be considered to further improve the method.
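A minimal sketch, assuming the sentence-transformers library as the source of sentence-level semantic representations; the specific encoder and the downstream classifier are illustrative stand-ins, not necessarily those benchmarked in the paper. Graded responses are embedded once, and a simple classifier maps embeddings to scores.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")

graded_responses = [
    "the fractions need a common denominator before you can add them",
    "you just add the tops and the bottoms",
    "i multiplied both parts by two to get an equivalent fraction",
    "no idea",
]
scores = [1, 0, 1, 0]

X = encoder.encode(graded_responses)          # sentence-level semantic vectors
clf = LogisticRegression(max_iter=1000).fit(X, scores)

new_response = ["first rewrite both fractions with the same denominator"]
print(clf.predict(encoder.encode(new_response)))
```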